Readiness gates implementation for eager mode #130
@eytan-avisror What is the best place to check readiness gates for lazy mode?
Codecov Report
@@            Coverage Diff             @@
##           master     #130      +/-   ##
==========================================
- Coverage   66.97%   66.10%   -0.87%
==========================================
  Files          10       10
  Lines         872      891      +19
==========================================
+ Hits          584      589       +5
- Misses        257      271      +14
  Partials       31       31
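The figures in the Codecov report above can be cross-checked with a few lines of Go. This is only an illustrative sketch: the `report` type and the helper names are hypothetical, and truncating (rather than rounding) to two decimals is an assumption that happens to reproduce the percentages shown in the report.

```go
package main

import "fmt"

// report holds the line counts from one column of the coverage diff.
// The type and field names are hypothetical, chosen for this sketch.
type report struct {
	hits, misses, partials, lines int
}

// consistent reports whether hits, misses and partials add up to the line total.
func (r report) consistent() bool {
	return r.hits+r.misses+r.partials == r.lines
}

// basisPoints returns coverage (hits/lines) in hundredths of a percent.
// Integer division truncates; this is an assumption about how the
// displayed percentages were derived.
func (r report) basisPoints() int {
	return r.hits * 10000 / r.lines
}

// formatPercent renders hundredths of a percent as e.g. "66.97%" or "-0.87%".
func formatPercent(bp int) string {
	sign := ""
	if bp < 0 {
		sign = "-"
		bp = -bp
	}
	return fmt.Sprintf("%s%d.%02d%%", sign, bp/100, bp%100)
}

func main() {
	base := report{hits: 584, misses: 257, partials: 31, lines: 872} // master
	head := report{hits: 589, misses: 271, partials: 31, lines: 891} // #130

	fmt.Println(base.consistent(), head.consistent())
	fmt.Println(formatPercent(base.basisPoints()), formatPercent(head.basisPoints()))
	fmt.Println(formatPercent(head.basisPoints() - base.basisPoints()))
}
```

Both columns are internally consistent (hits + misses + partials equals lines). The drop in coverage comes from 19 added lines of which only 5 are hit.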
Signed-off-by: Oleg Atamanenko <[email protected]>
Tested eager mode and confirmed feature works as expected.
Looks good to me 👍
Will give some time before merging in case @shrinandj has any comments
This looks great. Thanks for implementing it, @uthark!
* Validation step to check Nodes and ASG launch configs
  Signed-off-by: shreyas-badiger <[email protected]>
* Validating launch definition after a rolling upgrade
  Signed-off-by: shreyas-badiger <[email protected]>
* Resolve error log message and return statement
  Signed-off-by: shreyas-badiger <[email protected]>
* Adding Functional Test (#113)
  * Adding BDD, workflow and badge
  * Changing CI workflow job name
  * Adding make manifests
  * Clarifying cron time zone comment
  Signed-off-by: shreyas-badiger <[email protected]>
* release 0.13 (#115)
  * release 0.13
  * Update CHANGELOG.md
  Signed-off-by: shreyas-badiger <[email protected]>
* bump version (#116)
  Signed-off-by: shreyas-badiger <[email protected]>
* Repo selection for CI and BDD workflows & CI step for releases (#117)
  * CI-BDD not on forks & Step for releases (#2)
  * Testing CI-BDD not on forks & Step for releases
  * Adding step for image with tag git-tag
  Signed-off-by: shreyas-badiger <[email protected]>
* Terminate unjoined nodes
  Signed-off-by: shreyas-badiger <[email protected]>
* Resolving PR comments
  Signed-off-by: shreyas-badiger <[email protected]>
* Set version and update CHANGELOG for version 0.14. (#121)
  Co-authored-by: Shri Javadekar <[email protected]>
  Signed-off-by: shreyas-badiger <[email protected]>
* Bump version to 0.15-dev.
  Signed-off-by: shreyas-badiger <[email protected]>
* Fix typo in README.md. (#125)
  Signed-off-by: shreyas-badiger <[email protected]>
* Ignore the terminated instance during upgrade
  Signed-off-by: shreyas-badiger <[email protected]>
* Added WARNING prefix in the logging
  Signed-off-by: shreyas-badiger <[email protected]>
* Apply suggestions from code review
  Co-authored-by: Kevin Downey <[email protected]>
  Signed-off-by: shreyas-badiger <[email protected]>
* Capitalize sprintf to Sprintf
  Signed-off-by: shreyas-badiger <[email protected]>
* Upgrade to Go 1.15 (#128)
  Signed-off-by: Oleg Atamanenko <[email protected]>
  Signed-off-by: shreyas-badiger <[email protected]>
* Fix few typos and simplify error returns, remove redundant types (#131)
  Signed-off-by: Oleg Atamanenko <[email protected]>
  Signed-off-by: shreyas-badiger <[email protected]>
* Readiness gates implementation for eager mode (#130)
  Signed-off-by: Oleg Atamanenko <[email protected]>
  Signed-off-by: shreyas-badiger <[email protected]>

Co-authored-by: Alfredo Garo <[email protected]>
Co-authored-by: Eytan Avisror <[email protected]>
Co-authored-by: Shri Javadekar <[email protected]>
Co-authored-by: Shri Javadekar <[email protected]>
Co-authored-by: Shri Javadekar <[email protected]>
Co-authored-by: Craig Robson <[email protected]>
Co-authored-by: Kevin Downey <[email protected]>
Co-authored-by: Oleg Atamanenko <[email protected]>
Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: Alfredo Garo <[email protected]>
Signed-off-by: sbadiger <[email protected]> * Fix bug in deleting the entry in syncMap (#203) Signed-off-by: sbadiger <[email protected]> * Unit tests for controller-v2 (#215) * Unit tests Signed-off-by: sbadiger <[email protected]> * minor change in accessing the namespace name Signed-off-by: sbadiger <[email protected]> * move helper functions to a differnt file Signed-off-by: sbadiger <[email protected]> * #2285: rollup CR statistic metrics in v2 (#218) * #2285: rollup CR statistic metrics in v2 Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * #2285: renamed some methods related to metrics (#224) Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * #2286: removed version from metric namespace (#227) Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * Create RollingUpgradeContext (#234) * #2285: rollup CR statistic metrics in v2 (#218) * #2285: rollup CR statistic metrics in v2 Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * log cloud discovery failure Signed-off-by: sbadiger <[email protected]> * Create RollingUpgrade Context Signed-off-by: sbadiger <[email protected]> * rollingupgrade context Signed-off-by: sbadiger <[email protected]> Co-authored-by: Sahil Badla <[email protected]> Signed-off-by: sbadiger <[email protected]> * Resolve compile errors caused by merge conflict. 
(#235) * #2285: rollup CR statistic metrics in v2 (#218) * #2285: rollup CR statistic metrics in v2 Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * log cloud discovery failure Signed-off-by: sbadiger <[email protected]> * Create RollingUpgrade Context Signed-off-by: sbadiger <[email protected]> * rollingupgrade context Signed-off-by: sbadiger <[email protected]> * resolve compile errors due to merge conflict Signed-off-by: sbadiger <[email protected]> Co-authored-by: Sahil Badla <[email protected]> Signed-off-by: sbadiger <[email protected]> * upgrade-manager-v2: Move DrainManager back to Reconciler (#236) * #2285: rollup CR statistic metrics in v2 (#218) * #2285: rollup CR statistic metrics in v2 Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * log cloud discovery failure Signed-off-by: sbadiger <[email protected]> * Create RollingUpgrade Context Signed-off-by: sbadiger <[email protected]> * rollingupgrade context Signed-off-by: sbadiger <[email protected]> * #2285: rollup CR statistic metrics in v2 (#218) * #2285: rollup CR statistic metrics in v2 Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * #2285: renamed some methods related to metrics (#224) Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * #2286: removed version from metric namespace (#227) Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * resolve compile errors due to merge 
conflict Signed-off-by: sbadiger <[email protected]> * move drain-manager to reconciler Signed-off-by: sbadiger <[email protected]> * initialize RollingUpgrade object Signed-off-by: sbadiger <[email protected]> * use bool instead of count for standby function Signed-off-by: sbadiger <[email protected]> * refactor in-progress and standby code Signed-off-by: sbadiger <[email protected]> * rename instance standby function Signed-off-by: sbadiger <[email protected]> * DrainManager changes in unit test files Signed-off-by: sbadiger <[email protected]> Co-authored-by: Sahil Badla <[email protected]> Signed-off-by: sbadiger <[email protected]> * V2 controller metrics concurrency fix (#231) * Refine the metrics status Signed-off-by: xshao <[email protected]> * Refine the metrics status Signed-off-by: xshao <[email protected]> * Fix test case error Signed-off-by: xshao <[email protected]> * Use group instead of ASG Signed-off-by: xshao <[email protected]> * Ignore generated code Signed-off-by: xshao <[email protected]> * Ignore generated code Signed-off-by: xshao <[email protected]> * Fix the concurrent issue Signed-off-by: xshao <[email protected]> * Fix the concurrent issue Signed-off-by: xshao <[email protected]> * Move metrics related functions into RollingUpgradeContext Signed-off-by: xshao <[email protected]> * Move metrics related functions into RollingUpgradeContext Signed-off-by: xshao <[email protected]> * Move metrics related functions into upgrade_metrics.go Signed-off-by: xshao <[email protected]> * Move metrics related functions into metrics.go Signed-off-by: xshao <[email protected]> Signed-off-by: sbadiger <[email protected]> * add missing parenthesis (#239) Signed-off-by: sbadiger <[email protected]> * metricsMutex should be initialized (#240) Signed-off-by: xshao <[email protected]> Signed-off-by: sbadiger <[email protected]> * upgrade-manager-v2: Load test fixes (#245) * upgrade-manager-v2: Move DrainManager back to Reconciler (#236) * #2285: rollup CR 
statistic metrics in v2 (#218) * #2285: rollup CR statistic metrics in v2 Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * log cloud discovery failure Signed-off-by: sbadiger <[email protected]> * Create RollingUpgrade Context Signed-off-by: sbadiger <[email protected]> * rollingupgrade context Signed-off-by: sbadiger <[email protected]> * #2285: rollup CR statistic metrics in v2 (#218) * #2285: rollup CR statistic metrics in v2 Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * #2285: renamed some methods related to metrics (#224) Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * #2286: removed version from metric namespace (#227) Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * resolve compile errors due to merge conflict Signed-off-by: sbadiger <[email protected]> * move drain-manager to reconciler Signed-off-by: sbadiger <[email protected]> * initialize RollingUpgrade object Signed-off-by: sbadiger <[email protected]> * use bool instead of count for standby function Signed-off-by: sbadiger <[email protected]> * refactor in-progress and standby code Signed-off-by: sbadiger <[email protected]> * rename instance standby function Signed-off-by: sbadiger <[email protected]> * DrainManager changes in unit test files Signed-off-by: sbadiger <[email protected]> Co-authored-by: Sahil Badla <[email protected]> Signed-off-by: sbadiger <[email protected]> * V2 controller metrics concurrency fix (#231) * Refine the metrics status Signed-off-by: xshao <[email protected]> * Refine the metrics status Signed-off-by: xshao <[email 
protected]> * Fix test case error Signed-off-by: xshao <[email protected]> * Use group instead of ASG Signed-off-by: xshao <[email protected]> * Ignore generated code Signed-off-by: xshao <[email protected]> * Ignore generated code Signed-off-by: xshao <[email protected]> * Fix the concurrent issue Signed-off-by: xshao <[email protected]> * Fix the concurrent issue Signed-off-by: xshao <[email protected]> * Move metrics related functions into RollingUpgradeContext Signed-off-by: xshao <[email protected]> * Move metrics related functions into RollingUpgradeContext Signed-off-by: xshao <[email protected]> * Move metrics related functions into upgrade_metrics.go Signed-off-by: xshao <[email protected]> * Move metrics related functions into metrics.go Signed-off-by: xshao <[email protected]> Signed-off-by: sbadiger <[email protected]> * add missing parenthesis Signed-off-by: sbadiger <[email protected]> * load test fixes Signed-off-by: sbadiger <[email protected]> * handle scaling group not found Signed-off-by: sbadiger <[email protected]> * Update upgrade.go Signed-off-by: sbadiger <[email protected]> * log one level up * remove double logging Signed-off-by: sbadiger <[email protected]> * final push before RC release. 
(#254) * support IgnoreDrainFailures flag Signed-off-by: sbadiger <[email protected]> * add else condition Signed-off-by: sbadiger <[email protected]> * set min for maxUnavailable Signed-off-by: sbadiger <[email protected]> * calculateMaxUnavailable function Signed-off-by: sbadiger <[email protected]> * add a new coloumn (completePercentage) Signed-off-by: sbadiger <[email protected]> * disable debug logs by default Signed-off-by: sbadiger <[email protected]> * Fix metrics collecting issue (#249) * metricsMutex should be initialized Signed-off-by: xshao <[email protected]> * Use InProcessingNode instead of Stringp[] so that it can have the status of steps Signed-off-by: xshao <[email protected]> Signed-off-by: sbadiger <[email protected]> * Revert "Fix metrics collecting issue (#249)" (#256) This reverts commit f5dd1cb. Signed-off-by: sbadiger <[email protected]> * Fix metrics calculation issue (#258) * metricsMutex should be initialized Signed-off-by: xshao <[email protected]> * Use InProcessingNode instead of Stringp[] so that it can have the status of steps Signed-off-by: xshao <[email protected]> * Make the change backward compatible Signed-off-by: xshao <[email protected]> * Make the change backward compatible Signed-off-by: xshao <[email protected]> * Add mutex for InProcessingNode deleting Signed-off-by: xshao <[email protected]> Signed-off-by: sbadiger <[email protected]> * Add a mock for test and update version in Makefile (#262) Signed-off-by: sbadiger <[email protected]> * and CR end time (#264) Signed-off-by: sbadiger <[email protected]> * upgrade-manager-v2: expose totalProcessing time and other metrics (#265) * and CR end time Signed-off-by: sbadiger <[email protected]> * expose totalProcessing time and other metrics Signed-off-by: sbadiger <[email protected]> * addressing review comments Signed-off-by: sbadiger <[email protected]> * upgrade-manager-v2: remove function duplicate declaration. 
(#266) * and CR end time Signed-off-by: sbadiger <[email protected]> * expose totalProcessing time and other metrics Signed-off-by: sbadiger <[email protected]> * addressing review comments Signed-off-by: sbadiger <[email protected]> * remove function duplication Signed-off-by: sbadiger <[email protected]> * Carry the metrics status in RollingUpgrade CR (#267) * Update metrics status at same time Signed-off-by: xshao <[email protected]> * Update metrics status when terminating instance Signed-off-by: xshao <[email protected]> * Add terminated step Signed-off-by: xshao <[email protected]> * Add terminated step Signed-off-by: xshao <[email protected]> * Add terminated step Signed-off-by: xshao <[email protected]> Signed-off-by: sbadiger <[email protected]> * move cloud discovery after nodeInterval / drainInterval wait (#270) Signed-off-by: sbadiger <[email protected]> * upgrade-manager-v2: Add nodeEvents handler instead of a watch handler (#272) * upgrade-manager-v2: remove function duplicate declaration. 
(#266) * and CR end time Signed-off-by: sbadiger <[email protected]> * expose totalProcessing time and other metrics Signed-off-by: sbadiger <[email protected]> * addressing review comments Signed-off-by: sbadiger <[email protected]> * remove function duplication Signed-off-by: sbadiger <[email protected]> * Carry the metrics status in RollingUpgrade CR (#267) * Update metrics status at same time Signed-off-by: xshao <[email protected]> * Update metrics status when terminating instance Signed-off-by: xshao <[email protected]> * Add terminated step Signed-off-by: xshao <[email protected]> * Add terminated step Signed-off-by: xshao <[email protected]> * Add terminated step Signed-off-by: xshao <[email protected]> Signed-off-by: sbadiger <[email protected]> * move cloud discovery after nodeInterval / drainInterval wait Signed-off-by: sbadiger <[email protected]> * Add watch event for cluster nodes instead of API calls Signed-off-by: sbadiger <[email protected]> * upon node deletion, remove it from syncMap as well Signed-off-by: sbadiger <[email protected]> * Add nodeEvents handler instead of watch handler Signed-off-by: sbadiger <[email protected]> * Ignore Reconciles on nodeEvents Signed-off-by: sbadiger <[email protected]> * Add comments Signed-off-by: sbadiger <[email protected]> Co-authored-by: Sheldon Shao <[email protected]> Signed-off-by: sbadiger <[email protected]> * upgrade-manager-v2: Process next batch while waiting on nodeInterval period. (#273) * upgrade-manager-v2: remove function duplicate declaration. 
(#266) * and CR end time Signed-off-by: sbadiger <[email protected]> * expose totalProcessing time and other metrics Signed-off-by: sbadiger <[email protected]> * addressing review comments Signed-off-by: sbadiger <[email protected]> * remove function duplication Signed-off-by: sbadiger <[email protected]> * Carry the metrics status in RollingUpgrade CR (#267) * Update metrics status at same time Signed-off-by: xshao <[email protected]> * Update metrics status when terminating instance Signed-off-by: xshao <[email protected]> * Add terminated step Signed-off-by: xshao <[email protected]> * Add terminated step Signed-off-by: xshao <[email protected]> * Add terminated step Signed-off-by: xshao <[email protected]> Signed-off-by: sbadiger <[email protected]> * move cloud discovery after nodeInterval / drainInterval wait Signed-off-by: sbadiger <[email protected]> * Add watch event for cluster nodes instead of API calls Signed-off-by: sbadiger <[email protected]> * upon node deletion, remove it from syncMap as well Signed-off-by: sbadiger <[email protected]> * Add nodeEvents handler instead of watch handler Signed-off-by: sbadiger <[email protected]> * Ignore Reconciles on nodeEvents Signed-off-by: sbadiger <[email protected]> * Add comments Signed-off-by: sbadiger <[email protected]> * Set nextbatch to standBy while waiting for terminate * Avoid parallel reconcile operation per ASG * add default requeue time Co-authored-by: Sheldon Shao <[email protected]> Signed-off-by: sbadiger <[email protected]> * fix unit tests Signed-off-by: sbadiger <[email protected]> Co-authored-by: Eytan Avisror <[email protected]> Co-authored-by: Alfredo Garo <[email protected]> Co-authored-by: Eytan Avisror <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Co-authored-by: Craig Robson <[email protected]> Co-authored-by: Kevin Downey <[email protected]> Co-authored-by: 
Oleg Atamanenko <[email protected]> Co-authored-by: Shreyas Badiger <[email protected]> Co-authored-by: Adam Malcontenti-Wilson <[email protected]> Co-authored-by: Adam Malcontenti-Wilson <[email protected]> Co-authored-by: Sheldon Shao <[email protected]> Co-authored-by: Sahil Badla <[email protected]> Co-authored-by: Sheldon Shao <[email protected]>
* upgrade-manager-v2: Fix unit tests (#275) * Delete README.md Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * delete all Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * scaffolding Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * add API Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * initial code Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * add more scaffolding Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * Add kubernetes API calls Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * aws API calls Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * AWS API calls & Drift detection Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * initial rotation logic Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * Implemented RollingUpgrade object validation. (#176) * Validation step to check Nodes and ASG launch configs Signed-off-by: shreyas-badiger <[email protected]> * Validating launch definition after a rolling upgrade Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Fix all the "make vet" errors in Controller V2 branch. 
(#177) * Validation step to check Nodes and ASG launch configs Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Validating launch definition after a rolling upgrade Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Resolve error log message and return statement Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Adding Functional Test (#113) * Adding BDD, workflow and badge * Changing CI workflow job name * Adding make manifests * Clarifying cron time zone comment Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * release 0.13 (#115) * release 0.13 * Update CHANGELOG.md Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * bump version (#116) Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Repo selection for CI and BDD workflows & CI step for releases (#117) * CI-BDD not on forks & Step for releases (#2) * Testing CI-BDD not on forks & Step for releases * Adding step for image with tag git-tag Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Terminate unjoined nodes Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Resolving PR comments Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Set version and update CHANGELOG for version 0.14. (#121) Co-authored-by: Shri Javadekar <[email protected]> Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Bump version to 0.15-dev. Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Fix typo in README.md. 
(#125) Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Ignore the terminated instance during upgrade Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Added WARNING prefix in the logging Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Apply suggestions from code review Co-authored-by: Kevin Downey <[email protected]> Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Capitalize sprintf to Sprintf Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Upgrade to Go 1.15 (#128) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Fix few typos and simplify error returns, remove redundant types (#131) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Readiness gates implementation for eager mode (#130) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Adding Functional Test (#113) * Adding BDD, workflow and badge * Changing CI workflow job name * Adding make manifests * Clarifying cron time zone comment Signed-off-by: sbadiger <[email protected]> * Validation step to check Nodes and ASG launch configs (#112) * Validation step to check Nodes and ASG launch configs * Validating launch definition after a rolling upgrade * Resolve error log message and return statement Co-authored-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * release 0.13 (#115) * release 0.13 * Update CHANGELOG.md Signed-off-by: sbadiger <[email protected]> * bump version (#116) Signed-off-by: sbadiger <[email protected]> * Repo selection for CI and BDD 
workflows & CI step for releases (#117) * CI-BDD not on forks & Step for releases (#2) * Testing CI-BDD not on forks & Step for releases * Adding step for image with tag git-tag Signed-off-by: sbadiger <[email protected]> * Terminate unjoined nodes (#120) * Validation step to check Nodes and ASG launch configs * Validating launch definition after a rolling upgrade * Resolve error log message and return statement * Terminate unjoined nodes * Resolving PR comments Co-authored-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * Set version and update CHANGELOG for version 0.14. (#121) Co-authored-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Bump version to 0.15-dev. Signed-off-by: sbadiger <[email protected]> * Fix bug when switching to launch templates (#136) * Update rollingupgrade_controller.go * Update rollingupgrade_controller.go Signed-off-by: Eytan Avisror <[email protected]> * spacing fixes Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * Extract script runner to a separate type; fix work with env. variables (#132) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Set version and update CHANGELOG for version v0.15 (#137) Signed-off-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Bump version to v0.16-dev. 
Signed-off-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Propagate parent env variables to allow to talk with API Server (#144) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Bump Golang CI action to fix failed CI run (#146) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Simplify (#145) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Add Expiration to cache and do not refresh ASG if cache is not expired (#143) Signed-off-by: Oleg Atamanenko <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Fix documentation for uniform across AZ Update strategy and fix typos (#147) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Move cluster state from package level to a cluster state impl (#148) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Simplify work with intstr type. (#149) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * If instance is in standby mode already, just return (#138) Signed-off-by: Oleg Atamanenko <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Handle terminated instances gracefully. 
(#150) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Template version comparison fix (#155) * get template version Signed-off-by: Eytan Avisror <[email protected]> * fix tests Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * release 0.16 (#157) Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * bump version to 0.17-dev (#158) Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * Don't uncordon node on failure to run postDrain script when IgnoreDrainFailures set (#151) * Don't uncordon node on failure to run postDrain script when IgnoreDrainFailures set Signed-off-by: Adam Malcontenti-Wilson <[email protected]> * Test node uncordon when postDrain / postDrainWait script fails Signed-off-by: Adam Malcontenti-Wilson <[email protected]> Signed-off-by: sbadiger <[email protected]> * Abort on strategy failure instead of continuing (#152) * Abort on strategy failure instead of continuing Signed-off-by: Adam Malcontenti-Wilson <[email protected]> * Remove unformatted error message placeholder Signed-off-by: Adam Malcontenti-Wilson <[email protected]> * Explictly specify strategy for tests Signed-off-by: Adam Malcontenti-Wilson <[email protected]> Signed-off-by: sbadiger <[email protected]> * use NamespacedName (#160) Signed-off-by: Eytan Avisror <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Set version and update CHANGELOG for version v0.17 (#161) Signed-off-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Bump version to v0.18-dev (#162) Signed-off-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Move constants to types so that they can be reused (#167) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * 
Remove separate module for pkg/log (#168) Signed-off-by: Oleg Atamanenko <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Bump dependencies. (#169) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * use standard fmt.Errorf to format error message; unify error format (#171) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Fix namespaced name order (#170) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Add instance id to the logs (#173) Signed-off-by: Oleg Atamanenko <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Bump golang and busybox (#172) Signed-off-by: Oleg Atamanenko <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Expose template list and other execution errors to logs (#166) * Log and return wrapped launchtemplate error Signed-off-by: Adam Malcontenti-Wilson <[email protected]> * Expose execution error in logs Signed-off-by: Adam Malcontenti-Wilson <[email protected]> Signed-off-by: sbadiger <[email protected]> * output can contain other messages from API Server, so be more relaxed (#174) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Delete README.md Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * delete all Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * scaffolding Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * add API Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * initial code Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * add 
more scaffolding Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * Add kubernetes API calls Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * aws API calls Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * AWS API calls & Drift detection Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * validate() function Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * modified validate() Signed-off-by: sbadiger <[email protected]> * modified validate() Signed-off-by: sbadiger <[email protected]> * initial rotation logic Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * basic script_runner without any modifications Signed-off-by: sbadiger <[email protected]> * Fix all the vet related errors Signed-off-by: sbadiger <[email protected]> Co-authored-by: Alfredo Garo <[email protected]> Co-authored-by: Eytan Avisror <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Co-authored-by: Craig Robson <[email protected]> Co-authored-by: Kevin Downey <[email protected]> Co-authored-by: Oleg Atamanenko <[email protected]> Co-authored-by: Shreyas Badiger <[email protected]> Co-authored-by: Adam Malcontenti-Wilson <[email protected]> Co-authored-by: Adam Malcontenti-Wilson <[email protected]> Co-authored-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * Controller v2: Implementation of Instance termination (#178) * fix make vet errors. Signed-off-by: sbadiger <[email protected]> * Terminate instances and run v2 for first time. 
Signed-off-by: sbadiger <[email protected]> * Addressing review comments Signed-off-by: sbadiger <[email protected]> * addressing more review comments Signed-off-by: sbadiger <[email protected]> * Log error message Signed-off-by: sbadiger <[email protected]> * error handling for instance tagging Signed-off-by: sbadiger <[email protected]> * Migrate Script Runner (#179) * Basic script runner Signed-off-by: Eytan Avisror <[email protected]> * Update upgrade.go Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * Implemented node drain. (#181) Signed-off-by: sbadiger <[email protected]> * Eager mode implementation (#183) * Eager mode implementation Signed-off-by: sbadiger <[email protected]> * Metrics features (#189) Signed-off-by: xshao <[email protected]> Signed-off-by: sbadiger <[email protected]> * Process the batch rotation in parallel (#192) * Process the batch rotation in parallel Signed-off-by: sbadiger <[email protected]> * addressing review comments Signed-off-by: sbadiger <[email protected]> * Move the DrainManager within ReplaceBatch(), to access one per RollingUpgrade CR (#195) Signed-off-by: sbadiger <[email protected]> * Refine metrics implementation to support goroutines (#196) * Refine the metrics status Signed-off-by: xshao <[email protected]> * Refine the metrics status Signed-off-by: xshao <[email protected]> * Fix test case error Signed-off-by: xshao <[email protected]> * Use group instead of ASG Signed-off-by: xshao <[email protected]> Signed-off-by: sbadiger <[email protected]> * Ignore generated code (#201) * Refine the metrics status Signed-off-by: xshao <[email protected]> * Refine the metrics status Signed-off-by: xshao <[email protected]> * Fix test case error Signed-off-by: xshao <[email protected]> * Use group instead of ASG Signed-off-by: xshao <[email protected]> * Ignore generated code Signed-off-by: xshao <[email protected]> * Ignore generated code Signed-off-by: xshao <[email protected]> 
Signed-off-by: sbadiger <[email protected]> * Fix bug in deleting the entry in syncMap (#203) Signed-off-by: sbadiger <[email protected]> * Unit tests for controller-v2 (#215) * Unit tests Signed-off-by: sbadiger <[email protected]> * minor change in accessing the namespace name Signed-off-by: sbadiger <[email protected]> * move helper functions to a different file Signed-off-by: sbadiger <[email protected]> * #2285: rollup CR statistic metrics in v2 (#218) * #2285: rollup CR statistic metrics in v2 Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * #2285: renamed some methods related to metrics (#224) Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * #2286: removed version from metric namespace (#227) Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * Create RollingUpgradeContext (#234) * #2285: rollup CR statistic metrics in v2 (#218) * #2285: rollup CR statistic metrics in v2 Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * log cloud discovery failure Signed-off-by: sbadiger <[email protected]> * Create RollingUpgrade Context Signed-off-by: sbadiger <[email protected]> * rollingupgrade context Signed-off-by: sbadiger <[email protected]> Co-authored-by: Sahil Badla <[email protected]> Signed-off-by: sbadiger <[email protected]> * Resolve compile errors caused by merge conflict.
(#235) * #2285: rollup CR statistic metrics in v2 (#218) * #2285: rollup CR statistic metrics in v2 Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * log cloud discovery failure Signed-off-by: sbadiger <[email protected]> * Create RollingUpgrade Context Signed-off-by: sbadiger <[email protected]> * rollingupgrade context Signed-off-by: sbadiger <[email protected]> * resolve compile errors due to merge conflict Signed-off-by: sbadiger <[email protected]> Co-authored-by: Sahil Badla <[email protected]> Signed-off-by: sbadiger <[email protected]> * upgrade-manager-v2: Move DrainManager back to Reconciler (#236) * #2285: rollup CR statistic metrics in v2 (#218) * #2285: rollup CR statistic metrics in v2 Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * log cloud discovery failure Signed-off-by: sbadiger <[email protected]> * Create RollingUpgrade Context Signed-off-by: sbadiger <[email protected]> * rollingupgrade context Signed-off-by: sbadiger <[email protected]> * #2285: rollup CR statistic metrics in v2 (#218) * #2285: rollup CR statistic metrics in v2 Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * #2285: renamed some methods related to metrics (#224) Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * #2286: removed version from metric namespace (#227) Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * resolve compile errors due to merge 
conflict Signed-off-by: sbadiger <[email protected]> * move drain-manager to reconciler Signed-off-by: sbadiger <[email protected]> * initialize RollingUpgrade object Signed-off-by: sbadiger <[email protected]> * use bool instead of count for standby function Signed-off-by: sbadiger <[email protected]> * refactor in-progress and standby code Signed-off-by: sbadiger <[email protected]> * rename instance standby function Signed-off-by: sbadiger <[email protected]> * DrainManager changes in unit test files Signed-off-by: sbadiger <[email protected]> Co-authored-by: Sahil Badla <[email protected]> Signed-off-by: sbadiger <[email protected]> * V2 controller metrics concurrency fix (#231) * Refine the metrics status Signed-off-by: xshao <[email protected]> * Refine the metrics status Signed-off-by: xshao <[email protected]> * Fix test case error Signed-off-by: xshao <[email protected]> * Use group instead of ASG Signed-off-by: xshao <[email protected]> * Ignore generated code Signed-off-by: xshao <[email protected]> * Ignore generated code Signed-off-by: xshao <[email protected]> * Fix the concurrent issue Signed-off-by: xshao <[email protected]> * Fix the concurrent issue Signed-off-by: xshao <[email protected]> * Move metrics related functions into RollingUpgradeContext Signed-off-by: xshao <[email protected]> * Move metrics related functions into RollingUpgradeContext Signed-off-by: xshao <[email protected]> * Move metrics related functions into upgrade_metrics.go Signed-off-by: xshao <[email protected]> * Move metrics related functions into metrics.go Signed-off-by: xshao <[email protected]> Signed-off-by: sbadiger <[email protected]> * add missing parenthesis (#239) Signed-off-by: sbadiger <[email protected]> * metricsMutex should be initialized (#240) Signed-off-by: xshao <[email protected]> Signed-off-by: sbadiger <[email protected]> * upgrade-manager-v2: Load test fixes (#245) * upgrade-manager-v2: Move DrainManager back to Reconciler (#236) * #2285: rollup CR 
statistic metrics in v2 (#218) * #2285: rollup CR statistic metrics in v2 Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * log cloud discovery failure Signed-off-by: sbadiger <[email protected]> * Create RollingUpgrade Context Signed-off-by: sbadiger <[email protected]> * rollingupgrade context Signed-off-by: sbadiger <[email protected]> * #2285: rollup CR statistic metrics in v2 (#218) * #2285: rollup CR statistic metrics in v2 Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * #2285: renamed some methods related to metrics (#224) Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * #2286: removed version from metric namespace (#227) Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * resolve compile errors due to merge conflict Signed-off-by: sbadiger <[email protected]> * move drain-manager to reconciler Signed-off-by: sbadiger <[email protected]> * initialize RollingUpgrade object Signed-off-by: sbadiger <[email protected]> * use bool instead of count for standby function Signed-off-by: sbadiger <[email protected]> * refactor in-progress and standby code Signed-off-by: sbadiger <[email protected]> * rename instance standby function Signed-off-by: sbadiger <[email protected]> * DrainManager changes in unit test files Signed-off-by: sbadiger <[email protected]> Co-authored-by: Sahil Badla <[email protected]> Signed-off-by: sbadiger <[email protected]> * V2 controller metrics concurrency fix (#231) * Refine the metrics status Signed-off-by: xshao <[email protected]> * Refine the metrics status Signed-off-by: xshao <[email 
protected]> * Fix test case error Signed-off-by: xshao <[email protected]> * Use group instead of ASG Signed-off-by: xshao <[email protected]> * Ignore generated code Signed-off-by: xshao <[email protected]> * Ignore generated code Signed-off-by: xshao <[email protected]> * Fix the concurrent issue Signed-off-by: xshao <[email protected]> * Fix the concurrent issue Signed-off-by: xshao <[email protected]> * Move metrics related functions into RollingUpgradeContext Signed-off-by: xshao <[email protected]> * Move metrics related functions into RollingUpgradeContext Signed-off-by: xshao <[email protected]> * Move metrics related functions into upgrade_metrics.go Signed-off-by: xshao <[email protected]> * Move metrics related functions into metrics.go Signed-off-by: xshao <[email protected]> Signed-off-by: sbadiger <[email protected]> * add missing parenthesis Signed-off-by: sbadiger <[email protected]> * load test fixes Signed-off-by: sbadiger <[email protected]> * handle scaling group not found Signed-off-by: sbadiger <[email protected]> * Update upgrade.go Signed-off-by: sbadiger <[email protected]> * log one level up * remove double logging Signed-off-by: sbadiger <[email protected]> * final push before RC release. 
(#254) * support IgnoreDrainFailures flag Signed-off-by: sbadiger <[email protected]> * add else condition Signed-off-by: sbadiger <[email protected]> * set min for maxUnavailable Signed-off-by: sbadiger <[email protected]> * calculateMaxUnavailable function Signed-off-by: sbadiger <[email protected]> * add a new column (completePercentage) Signed-off-by: sbadiger <[email protected]> * disable debug logs by default Signed-off-by: sbadiger <[email protected]> * Fix metrics collecting issue (#249) * metricsMutex should be initialized Signed-off-by: xshao <[email protected]> * Use InProcessingNode instead of Stringp[] so that it can have the status of steps Signed-off-by: xshao <[email protected]> Signed-off-by: sbadiger <[email protected]> * Revert "Fix metrics collecting issue (#249)" (#256) This reverts commit f5dd1cb. Signed-off-by: sbadiger <[email protected]> * Fix metrics calculation issue (#258) * metricsMutex should be initialized Signed-off-by: xshao <[email protected]> * Use InProcessingNode instead of Stringp[] so that it can have the status of steps Signed-off-by: xshao <[email protected]> * Make the change backward compatible Signed-off-by: xshao <[email protected]> * Make the change backward compatible Signed-off-by: xshao <[email protected]> * Add mutex for InProcessingNode deleting Signed-off-by: xshao <[email protected]> Signed-off-by: sbadiger <[email protected]> * Add a mock for test and update version in Makefile (#262) Signed-off-by: sbadiger <[email protected]> * and CR end time (#264) Signed-off-by: sbadiger <[email protected]> * upgrade-manager-v2: expose totalProcessing time and other metrics (#265) * and CR end time Signed-off-by: sbadiger <[email protected]> * expose totalProcessing time and other metrics Signed-off-by: sbadiger <[email protected]> * addressing review comments Signed-off-by: sbadiger <[email protected]> * upgrade-manager-v2: remove function duplicate declaration.
(#266) * and CR end time Signed-off-by: sbadiger <[email protected]> * expose totalProcessing time and other metrics Signed-off-by: sbadiger <[email protected]> * addressing review comments Signed-off-by: sbadiger <[email protected]> * remove function duplication Signed-off-by: sbadiger <[email protected]> * Carry the metrics status in RollingUpgrade CR (#267) * Update metrics status at same time Signed-off-by: xshao <[email protected]> * Update metrics status when terminating instance Signed-off-by: xshao <[email protected]> * Add terminated step Signed-off-by: xshao <[email protected]> * Add terminated step Signed-off-by: xshao <[email protected]> * Add terminated step Signed-off-by: xshao <[email protected]> Signed-off-by: sbadiger <[email protected]> * move cloud discovery after nodeInterval / drainInterval wait (#270) Signed-off-by: sbadiger <[email protected]> * upgrade-manager-v2: Add nodeEvents handler instead of a watch handler (#272) * upgrade-manager-v2: remove function duplicate declaration. 
(#266) * and CR end time Signed-off-by: sbadiger <[email protected]> * expose totalProcessing time and other metrics Signed-off-by: sbadiger <[email protected]> * addressing review comments Signed-off-by: sbadiger <[email protected]> * remove function duplication Signed-off-by: sbadiger <[email protected]> * Carry the metrics status in RollingUpgrade CR (#267) * Update metrics status at same time Signed-off-by: xshao <[email protected]> * Update metrics status when terminating instance Signed-off-by: xshao <[email protected]> * Add terminated step Signed-off-by: xshao <[email protected]> * Add terminated step Signed-off-by: xshao <[email protected]> * Add terminated step Signed-off-by: xshao <[email protected]> Signed-off-by: sbadiger <[email protected]> * move cloud discovery after nodeInterval / drainInterval wait Signed-off-by: sbadiger <[email protected]> * Add watch event for cluster nodes instead of API calls Signed-off-by: sbadiger <[email protected]> * upon node deletion, remove it from syncMap as well Signed-off-by: sbadiger <[email protected]> * Add nodeEvents handler instead of watch handler Signed-off-by: sbadiger <[email protected]> * Ignore Reconciles on nodeEvents Signed-off-by: sbadiger <[email protected]> * Add comments Signed-off-by: sbadiger <[email protected]> Co-authored-by: Sheldon Shao <[email protected]> Signed-off-by: sbadiger <[email protected]> * upgrade-manager-v2: Process next batch while waiting on nodeInterval period. (#273) * upgrade-manager-v2: remove function duplicate declaration. 
(#266) * and CR end time Signed-off-by: sbadiger <[email protected]> * expose totalProcessing time and other metrics Signed-off-by: sbadiger <[email protected]> * addressing review comments Signed-off-by: sbadiger <[email protected]> * remove function duplication Signed-off-by: sbadiger <[email protected]> * Carry the metrics status in RollingUpgrade CR (#267) * Update metrics status at same time Signed-off-by: xshao <[email protected]> * Update metrics status when terminating instance Signed-off-by: xshao <[email protected]> * Add terminated step Signed-off-by: xshao <[email protected]> * Add terminated step Signed-off-by: xshao <[email protected]> * Add terminated step Signed-off-by: xshao <[email protected]> Signed-off-by: sbadiger <[email protected]> * move cloud discovery after nodeInterval / drainInterval wait Signed-off-by: sbadiger <[email protected]> * Add watch event for cluster nodes instead of API calls Signed-off-by: sbadiger <[email protected]> * upon node deletion, remove it from syncMap as well Signed-off-by: sbadiger <[email protected]> * Add nodeEvents handler instead of watch handler Signed-off-by: sbadiger <[email protected]> * Ignore Reconciles on nodeEvents Signed-off-by: sbadiger <[email protected]> * Add comments Signed-off-by: sbadiger <[email protected]> * Set nextbatch to standBy while waiting for terminate * Avoid parallel reconcile operation per ASG * add default requeue time Co-authored-by: Sheldon Shao <[email protected]> Signed-off-by: sbadiger <[email protected]> * fix unit tests Signed-off-by: sbadiger <[email protected]> Co-authored-by: Eytan Avisror <[email protected]> Co-authored-by: Alfredo Garo <[email protected]> Co-authored-by: Eytan Avisror <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Co-authored-by: Craig Robson <[email protected]> Co-authored-by: Kevin Downey <[email protected]> Co-authored-by: 
Oleg Atamanenko <[email protected]> Co-authored-by: Shreyas Badiger <[email protected]> Co-authored-by: Adam Malcontenti-Wilson <[email protected]> Co-authored-by: Adam Malcontenti-Wilson <[email protected]> Co-authored-by: Sheldon Shao <[email protected]> Co-authored-by: Sahil Badla <[email protected]> Co-authored-by: Sheldon Shao <[email protected]> Signed-off-by: sbadiger <[email protected]> * add ci.yaml file Signed-off-by: sbadiger <[email protected]> * test commit to trigger ci build Signed-off-by: sbadiger <[email protected]> * move ci.yaml inside workflows Signed-off-by: sbadiger <[email protected]> * delete ci.yaml file from previous place Signed-off-by: sbadiger <[email protected]> * address lint issues Signed-off-by: sbadiger <[email protected]> * Update ci.yaml Signed-off-by: sbadiger <[email protected]> * generate coverage.txt file Signed-off-by: sbadiger <[email protected]> * fix golang lint errors Signed-off-by: sbadiger <[email protected]> * Delete delete-me.file * generate coverage.txt file Signed-off-by: sbadiger <[email protected]> Co-authored-by: Eytan Avisror <[email protected]> Co-authored-by: Alfredo Garo <[email protected]> Co-authored-by: Eytan Avisror <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Co-authored-by: Craig Robson <[email protected]> Co-authored-by: Kevin Downey <[email protected]> Co-authored-by: Oleg Atamanenko <[email protected]> Co-authored-by: Shreyas Badiger <[email protected]> Co-authored-by: Adam Malcontenti-Wilson <[email protected]> Co-authored-by: Adam Malcontenti-Wilson <[email protected]> Co-authored-by: Sheldon Shao <[email protected]> Co-authored-by: Sahil Badla <[email protected]> Co-authored-by: Sheldon Shao <[email protected]>
) * Delete README.md Signed-off-by: Eytan Avisror <[email protected]> * delete all Signed-off-by: Eytan Avisror <[email protected]> * scaffolding Signed-off-by: Eytan Avisror <[email protected]> * add API Signed-off-by: Eytan Avisror <[email protected]> * initial code Signed-off-by: Eytan Avisror <[email protected]> * add more scaffolding Signed-off-by: Eytan Avisror <[email protected]> * Add kubernetes API calls Signed-off-by: Eytan Avisror <[email protected]> * aws API calls Signed-off-by: Eytan Avisror <[email protected]> * AWS API calls & Drift detection Signed-off-by: Eytan Avisror <[email protected]> * initial rotation logic Signed-off-by: Eytan Avisror <[email protected]> * Implemented RollingUpgrade object validation. (#176) * Validation step to check Nodes and ASG launch configs Signed-off-by: shreyas-badiger <[email protected]> * Validating launch definition after a rolling upgrade Signed-off-by: shreyas-badiger <[email protected]> * Fix all the "make vet" errors in Controller V2 branch. 
(#177) * Validation step to check Nodes and ASG launch configs Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Validating launch definition after a rolling upgrade Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Resolve error log message and return statement Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Adding Functional Test (#113) * Adding BDD, workflow and badge * Changing CI workflow job name * Adding make manifests * Clarifying cron time zone comment Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * release 0.13 (#115) * release 0.13 * Update CHANGELOG.md Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * bump version (#116) Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Repo selection for CI and BDD workflows & CI step for releases (#117) * CI-BDD not on forks & Step for releases (#2) * Testing CI-BDD not on forks & Step for releases * Adding step for image with tag git-tag Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Terminate unjoined nodes Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Resolving PR comments Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Set version and update CHANGELOG for version 0.14. (#121) Co-authored-by: Shri Javadekar <[email protected]> Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Bump version to 0.15-dev. Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Fix typo in README.md. 
(#125) Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Ignore the terminated instance during upgrade Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Added WARNING prefix in the logging Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Apply suggestions from code review Co-authored-by: Kevin Downey <[email protected]> Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Capitalize sprintf to Sprintf Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Upgrade to Go 1.15 (#128) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Fix few typos and simplify error returns, remove redundant types (#131) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Readiness gates implementation for eager mode (#130) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Adding Functional Test (#113) * Adding BDD, workflow and badge * Changing CI workflow job name * Adding make manifests * Clarifying cron time zone comment Signed-off-by: sbadiger <[email protected]> * Validation step to check Nodes and ASG launch configs (#112) * Validation step to check Nodes and ASG launch configs * Validating launch definition after a rolling upgrade * Resolve error log message and return statement Co-authored-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * release 0.13 (#115) * release 0.13 * Update CHANGELOG.md Signed-off-by: sbadiger <[email protected]> * bump version (#116) Signed-off-by: sbadiger <[email protected]> * Repo selection for CI and BDD 
workflows & CI step for releases (#117) * CI-BDD not on forks & Step for releases (#2) * Testing CI-BDD not on forks & Step for releases * Adding step for image with tag git-tag Signed-off-by: sbadiger <[email protected]> * Terminate unjoined nodes (#120) * Validation step to check Nodes and ASG launch configs * Validating launch definition after a rolling upgrade * Resolve error log message and return statement * Terminate unjoined nodes * Resolving PR comments Co-authored-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * Set version and update CHANGELOG for version 0.14. (#121) Co-authored-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Bump version to 0.15-dev. Signed-off-by: sbadiger <[email protected]> * Fix bug when switching to launch templates (#136) * Update rollingupgrade_controller.go * Update rollingupgrade_controller.go Signed-off-by: Eytan Avisror <[email protected]> * spacing fixes Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * Extract script runner to a separate type; fix work with env. variables (#132) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Set version and update CHANGELOG for version v0.15 (#137) Signed-off-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Bump version to v0.16-dev. 
Signed-off-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Propagate parent env variables to allow to talk with API Server (#144) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Bump Golang CI action to fix failed CI run (#146) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Simplify (#145) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Add Expiration to cache and do not refresh ASG if cache is not expired (#143) Signed-off-by: Oleg Atamanenko <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Fix documentation for uniform across AZ Update strategy and fix typos (#147) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Move cluster state from package level to a cluster state impl (#148) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Simplify work with intstr type. (#149) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * If instance is in standby mode already, just return (#138) Signed-off-by: Oleg Atamanenko <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Handle terminated instances gracefully. 
(#150) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Template version comparison fix (#155) * get template version Signed-off-by: Eytan Avisror <[email protected]> * fix tests Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * release 0.16 (#157) Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * bump version to 0.17-dev (#158) Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * Don't uncordon node on failure to run postDrain script when IgnoreDrainFailures set (#151) * Don't uncordon node on failure to run postDrain script when IgnoreDrainFailures set Signed-off-by: Adam Malcontenti-Wilson <[email protected]> * Test node uncordon when postDrain / postDrainWait script fails Signed-off-by: Adam Malcontenti-Wilson <[email protected]> Signed-off-by: sbadiger <[email protected]> * Abort on strategy failure instead of continuing (#152) * Abort on strategy failure instead of continuing Signed-off-by: Adam Malcontenti-Wilson <[email protected]> * Remove unformatted error message placeholder Signed-off-by: Adam Malcontenti-Wilson <[email protected]> * Explicitly specify strategy for tests Signed-off-by: Adam Malcontenti-Wilson <[email protected]> Signed-off-by: sbadiger <[email protected]> * use NamespacedName (#160) Signed-off-by: Eytan Avisror <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Set version and update CHANGELOG for version v0.17 (#161) Signed-off-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Bump version to v0.18-dev (#162) Signed-off-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Move constants to types so that they can be reused (#167) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> *
Remove separate module for pkg/log (#168) Signed-off-by: Oleg Atamanenko <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Bump dependencies. (#169) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * use standard fmt.Errorf to format error message; unify error format (#171) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Fix namespaced name order (#170) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Add instance id to the logs (#173) Signed-off-by: Oleg Atamanenko <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Bump golang and busybox (#172) Signed-off-by: Oleg Atamanenko <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Expose template list and other execution errors to logs (#166) * Log and return wrapped launchtemplate error Signed-off-by: Adam Malcontenti-Wilson <[email protected]> * Expose execution error in logs Signed-off-by: Adam Malcontenti-Wilson <[email protected]> Signed-off-by: sbadiger <[email protected]> * output can contain other messages from API Server, so be more relaxed (#174) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Delete README.md Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * delete all Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * scaffolding Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * add API Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * initial code Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * add 
more scaffolding Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * Add kubernetes API calls Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * aws API calls Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * AWS API calls & Drift detection Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * validate() function Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * modified validate() Signed-off-by: sbadiger <[email protected]> * modified validate() Signed-off-by: sbadiger <[email protected]> * initial rotation logic Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * basic script_runner without any modifications Signed-off-by: sbadiger <[email protected]> * Fix all the vet related errors Signed-off-by: sbadiger <[email protected]> Co-authored-by: Alfredo Garo <[email protected]> Co-authored-by: Eytan Avisror <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Co-authored-by: Craig Robson <[email protected]> Co-authored-by: Kevin Downey <[email protected]> Co-authored-by: Oleg Atamanenko <[email protected]> Co-authored-by: Shreyas Badiger <[email protected]> Co-authored-by: Adam Malcontenti-Wilson <[email protected]> Co-authored-by: Adam Malcontenti-Wilson <[email protected]> Co-authored-by: Eytan Avisror <[email protected]> * Controller v2: Implementation of Instance termination (#178) * fix make vet errors. Signed-off-by: sbadiger <[email protected]> * Terminate instances and run v2 for first time. 
Signed-off-by: sbadiger <[email protected]> * Addressing review comments Signed-off-by: sbadiger <[email protected]> * addressing more review comments Signed-off-by: sbadiger <[email protected]> * Log error message Signed-off-by: sbadiger <[email protected]> * error handling for instance tagging Signed-off-by: sbadiger <[email protected]> * Migrate Script Runner (#179) * Basic script runner Signed-off-by: Eytan Avisror <[email protected]> * Update upgrade.go Signed-off-by: Eytan Avisror <[email protected]> * Implemented node drain. (#181) * Eager mode implementation (#183) * Eager mode implementation Signed-off-by: sbadiger <[email protected]> * Metrics features (#189) Signed-off-by: xshao <[email protected]> * Process the batch rotation in parallel (#192) * Process the batch rotation in parallel Signed-off-by: sbadiger <[email protected]> * addressing review comments Signed-off-by: sbadiger <[email protected]> * Move the DrainManager within ReplaceBatch(), to access one per RollingUpgrade CR (#195) Signed-off-by: sbadiger <[email protected]> * Refine metrics implementation to support goroutines (#196) * Refine the metrics status Signed-off-by: xshao <[email protected]> * Refine the metrics status Signed-off-by: xshao <[email protected]> * Fix test case error Signed-off-by: xshao <[email protected]> * Use group instead of ASG Signed-off-by: xshao <[email protected]> * Ignore generated code (#201) * Refine the metrics status Signed-off-by: xshao <[email protected]> * Refine the metrics status Signed-off-by: xshao <[email protected]> * Fix test case error Signed-off-by: xshao <[email protected]> * Use group instead of ASG Signed-off-by: xshao <[email protected]> * Ignore generated code Signed-off-by: xshao <[email protected]> * Ignore generated code Signed-off-by: xshao <[email protected]> * Fix bug in deleting the entry in syncMap (#203) Signed-off-by: sbadiger <[email protected]> * Unit tests for controller-v2 (#215) * Unit tests Signed-off-by: sbadiger <[email 
protected]> * minor change in accessing the namespace name Signed-off-by: sbadiger <[email protected]> * move helper functions to a different file Signed-off-by: sbadiger <[email protected]> * #2285: rollup CR statistic metrics in v2 (#218) * #2285: rollup CR statistic metrics in v2 Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: renamed some methods related to metrics (#224) Signed-off-by: sbadla1 <[email protected]> * #2286: removed version from metric namespace (#227) Signed-off-by: sbadla1 <[email protected]> * Create RollingUpgradeContext (#234) * #2285: rollup CR statistic metrics in v2 (#218) * #2285: rollup CR statistic metrics in v2 Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * log cloud discovery failure Signed-off-by: sbadiger <[email protected]> * Create RollingUpgrade Context Signed-off-by: sbadiger <[email protected]> * rollingupgrade context Signed-off-by: sbadiger <[email protected]> Co-authored-by: Sahil Badla <[email protected]> * Resolve compile errors caused by merge conflict. 
(#235) * #2285: rollup CR statistic metrics in v2 (#218) * #2285: rollup CR statistic metrics in v2 Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * log cloud discovery failure Signed-off-by: sbadiger <[email protected]> * Create RollingUpgrade Context Signed-off-by: sbadiger <[email protected]> * rollingupgrade context Signed-off-by: sbadiger <[email protected]> * resolve compile errors due to merge conflict Signed-off-by: sbadiger <[email protected]> Co-authored-by: Sahil Badla <[email protected]> * upgrade-manager-v2: Move DrainManager back to Reconciler (#236) * #2285: rollup CR statistic metrics in v2 (#218) * #2285: rollup CR statistic metrics in v2 Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * log cloud discovery failure Signed-off-by: sbadiger <[email protected]> * Create RollingUpgrade Context Signed-off-by: sbadiger <[email protected]> * rollingupgrade context Signed-off-by: sbadiger <[email protected]> * #2285: rollup CR statistic metrics in v2 (#218) * #2285: rollup CR statistic metrics in v2 Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * #2285: renamed some methods related to metrics (#224) Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * #2286: removed version from metric namespace (#227) Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * resolve compile errors due to merge conflict Signed-off-by: sbadiger <[email 
protected]> * move drain-manager to reconciler Signed-off-by: sbadiger <[email protected]> * initialize RollingUpgrade object Signed-off-by: sbadiger <[email protected]> * use bool instead of count for standby function Signed-off-by: sbadiger <[email protected]> * refactor in-progress and standby code Signed-off-by: sbadiger <[email protected]> * rename instance standby function Signed-off-by: sbadiger <[email protected]> * DrainManager changes in unit test files Signed-off-by: sbadiger <[email protected]> Co-authored-by: Sahil Badla <[email protected]> * V2 controller metrics concurrency fix (#231) * Refine the metrics status Signed-off-by: xshao <[email protected]> * Refine the metrics status Signed-off-by: xshao <[email protected]> * Fix test case error Signed-off-by: xshao <[email protected]> * Use group instead of ASG Signed-off-by: xshao <[email protected]> * Ignore generated code Signed-off-by: xshao <[email protected]> * Ignore generated code Signed-off-by: xshao <[email protected]> * Fix the concurrent issue Signed-off-by: xshao <[email protected]> * Fix the concurrent issue Signed-off-by: xshao <[email protected]> * Move metrics related functions into RollingUpgradeContext Signed-off-by: xshao <[email protected]> * Move metrics related functions into RollingUpgradeContext Signed-off-by: xshao <[email protected]> * Move metrics related functions into upgrade_metrics.go Signed-off-by: xshao <[email protected]> * Move metrics related functions into metrics.go Signed-off-by: xshao <[email protected]> * add missing parenthesis (#239) * metricsMutex should be initialized (#240) Signed-off-by: xshao <[email protected]> * upgrade-manager-v2: Load test fixes (#245) * upgrade-manager-v2: Move DrainManager back to Reconciler (#236) * #2285: rollup CR statistic metrics in v2 (#218) * #2285: rollup CR statistic metrics in v2 Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags 
Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * log cloud discovery failure Signed-off-by: sbadiger <[email protected]> * Create RollingUpgrade Context Signed-off-by: sbadiger <[email protected]> * rollingupgrade context Signed-off-by: sbadiger <[email protected]> * #2285: rollup CR statistic metrics in v2 (#218) * #2285: rollup CR statistic metrics in v2 Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * #2285: renamed some methods related to metrics (#224) Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * #2286: removed version from metric namespace (#227) Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * resolve compile errors due to merge conflict Signed-off-by: sbadiger <[email protected]> * move drain-manager to reconciler Signed-off-by: sbadiger <[email protected]> * initialize RollingUpgrade object Signed-off-by: sbadiger <[email protected]> * use bool instead of count for standby function Signed-off-by: sbadiger <[email protected]> * refactor in-progress and standby code Signed-off-by: sbadiger <[email protected]> * rename instance standby function Signed-off-by: sbadiger <[email protected]> * DrainManager changes in unit test files Signed-off-by: sbadiger <[email protected]> Co-authored-by: Sahil Badla <[email protected]> Signed-off-by: sbadiger <[email protected]> * V2 controller metrics concurrency fix (#231) * Refine the metrics status Signed-off-by: xshao <[email protected]> * Refine the metrics status Signed-off-by: xshao <[email protected]> * Fix test case error Signed-off-by: xshao <[email protected]> * Use group instead of ASG Signed-off-by: xshao <[email protected]> * Ignore generated code Signed-off-by: xshao <[email protected]> * Ignore generated 
code Signed-off-by: xshao <[email protected]> * Fix the concurrent issue Signed-off-by: xshao <[email protected]> * Fix the concurrent issue Signed-off-by: xshao <[email protected]> * Move metrics related functions into RollingUpgradeContext Signed-off-by: xshao <[email protected]> * Move metrics related functions into RollingUpgradeContext Signed-off-by: xshao <[email protected]> * Move metrics related functions into upgrade_metrics.go Signed-off-by: xshao <[email protected]> * Move metrics related functions into metrics.go Signed-off-by: xshao <[email protected]> Signed-off-by: sbadiger <[email protected]> * add missing parenthesis Signed-off-by: sbadiger <[email protected]> * load test fixes Signed-off-by: sbadiger <[email protected]> * handle scaling group not found Signed-off-by: sbadiger <[email protected]> * Update upgrade.go Signed-off-by: sbadiger <[email protected]> * log one level up * remove double logging Signed-off-by: sbadiger <[email protected]> * final push before RC release. (#254) * support IgnoreDrainFailures flag Signed-off-by: sbadiger <[email protected]> * add else condition Signed-off-by: sbadiger <[email protected]> * set min for maxUnavailable Signed-off-by: sbadiger <[email protected]> * calculateMaxUnavailable function Signed-off-by: sbadiger <[email protected]> * add a new column (completePercentage) Signed-off-by: sbadiger <[email protected]> * disable debug logs by default Signed-off-by: sbadiger <[email protected]> * Fix metrics collecting issue (#249) * metricsMutex should be initialized Signed-off-by: xshao <[email protected]> * Use InProcessingNode instead of Stringp[] so that it can have the status of steps Signed-off-by: xshao <[email protected]> * Revert "Fix metrics collecting issue (#249)" (#256) This reverts commit f5dd1cb5f76f2b78cb15c53daed14032a2a4c6ec. 
* Fix metrics calculation issue (#258) * metricsMutex should be initialized Signed-off-by: xshao <[email protected]> * Use InProcessingNode instead of Stringp[] so that it can have the status of steps Signed-off-by: xshao <[email protected]> * Make the change backward compatible Signed-off-by: xshao <[email protected]> * Make the change backward compatible Signed-off-by: xshao <[email protected]> * Add mutex for InProcessingNode deleting Signed-off-by: xshao <[email protected]> * Add a mock for test and update version in Makefile (#262) Signed-off-by: sbadiger <[email protected]> * and CR end time (#264) Signed-off-by: sbadiger <[email protected]> * upgrade-manager-v2: expose totalProcessing time and other metrics (#265) * and CR end time Signed-off-by: sbadiger <[email protected]> * expose totalProcessing time and other metrics Signed-off-by: sbadiger <[email protected]> * addressing review comments Signed-off-by: sbadiger <[email protected]> * upgrade-manager-v2: remove function duplicate declaration. (#266) * and CR end time Signed-off-by: sbadiger <[email protected]> * expose totalProcessing time and other metrics Signed-off-by: sbadiger <[email protected]> * addressing review comments Signed-off-by: sbadiger <[email protected]> * remove function duplication Signed-off-by: sbadiger <[email protected]> * Carry the metrics status in RollingUpgrade CR (#267) * Update metrics status at same time Signed-off-by: xshao <[email protected]> * Update metrics status when terminating instance Signed-off-by: xshao <[email protected]> * Add terminated step Signed-off-by: xshao <[email protected]> * Add terminated step Signed-off-by: xshao <[email protected]> * Add terminated step Signed-off-by: xshao <[email protected]> * move cloud discovery after nodeInterval / drainInterval wait (#270) Signed-off-by: sbadiger <[email protected]> * upgrade-manager-v2: Add nodeEvents handler instead of a watch handler (#272) * upgrade-manager-v2: remove function duplicate declaration. 
(#266) * and CR end time Signed-off-by: sbadiger <[email protected]> * expose totalProcessing time and other metrics Signed-off-by: sbadiger <[email protected]> * addressing review comments Signed-off-by: sbadiger <[email protected]> * remove function duplication Signed-off-by: sbadiger <[email protected]> * Carry the metrics status in RollingUpgrade CR (#267) * Update metrics status at same time Signed-off-by: xshao <[email protected]> * Update metrics status when terminating instance Signed-off-by: xshao <[email protected]> * Add terminated step Signed-off-by: xshao <[email protected]> * Add terminated step Signed-off-by: xshao <[email protected]> * Add terminated step Signed-off-by: xshao <[email protected]> Signed-off-by: sbadiger <[email protected]> * move cloud discovery after nodeInterval / drainInterval wait Signed-off-by: sbadiger <[email protected]> * Add watch event for cluster nodes instead of API calls Signed-off-by: sbadiger <[email protected]> * upon node deletion, remove it from syncMap as well Signed-off-by: sbadiger <[email protected]> * Add nodeEvents handler instead of watch handler Signed-off-by: sbadiger <[email protected]> * Ignore Reconciles on nodeEvents Signed-off-by: sbadiger <[email protected]> * Add comments Signed-off-by: sbadiger <[email protected]> Co-authored-by: Sheldon Shao <[email protected]> * upgrade-manager-v2: Process next batch while waiting on nodeInterval period. (#273) * upgrade-manager-v2: remove function duplicate declaration. 
(#266) * and CR end time Signed-off-by: sbadiger <[email protected]> * expose totalProcessing time and other metrics Signed-off-by: sbadiger <[email protected]> * addressing review comments Signed-off-by: sbadiger <[email protected]> * remove function duplication Signed-off-by: sbadiger <[email protected]> * Carry the metrics status in RollingUpgrade CR (#267) * Update metrics status at same time Signed-off-by: xshao <[email protected]> * Update metrics status when terminating instance Signed-off-by: xshao <[email protected]> * Add terminated step Signed-off-by: xshao <[email protected]> * Add terminated step Signed-off-by: xshao <[email protected]> * Add terminated step Signed-off-by: xshao <[email protected]> Signed-off-by: sbadiger <[email protected]> * move cloud discovery after nodeInterval / drainInterval wait Signed-off-by: sbadiger <[email protected]> * Add watch event for cluster nodes instead of API calls Signed-off-by: sbadiger <[email protected]> * upon node deletion, remove it from syncMap as well Signed-off-by: sbadiger <[email protected]> * Add nodeEvents handler instead of watch handler Signed-off-by: sbadiger <[email protected]> * Ignore Reconciles on nodeEvents Signed-off-by: sbadiger <[email protected]> * Add comments Signed-off-by: sbadiger <[email protected]> * Set nextbatch to standBy while waiting for terminate * Avoid parallel reconcile operation per ASG * add default requeue time Co-authored-by: Sheldon Shao <[email protected]> * upgrade-manager-v2: Fix unit tests (#275) * Delete README.md Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * delete all Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * scaffolding Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * add API Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * initial code Signed-off-by: Eytan Avisror 
<[email protected]> Signed-off-by: sbadiger <[email protected]> * add more scaffolding Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * Add kubernetes API calls Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * aws API calls Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * AWS API calls & Drift detection Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * initial rotation logic Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * Implemented RollingUpgrade object validation. (#176) * Validation step to check Nodes and ASG launch configs Signed-off-by: shreyas-badiger <[email protected]> * Validating launch definition after a rolling upgrade Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Fix all the "make vet" errors in Controller V2 branch. 
(#177) * Validation step to check Nodes and ASG launch configs Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Validating launch definition after a rolling upgrade Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Resolve error log message and return statement Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Adding Functional Test (#113) * Adding BDD, workflow and badge * Changing CI workflow job name * Adding make manifests * Clarifying cron time zone comment Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * release 0.13 (#115) * release 0.13 * Update CHANGELOG.md Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * bump version (#116) Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Repo selection for CI and BDD workflows & CI step for releases (#117) * CI-BDD not on forks & Step for releases (#2) * Testing CI-BDD not on forks & Step for releases * Adding step for image with tag git-tag Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Terminate unjoined nodes Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Resolving PR comments Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Set version and update CHANGELOG for version 0.14. (#121) Co-authored-by: Shri Javadekar <[email protected]> Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Bump version to 0.15-dev. Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Fix typo in README.md. 
(#125) Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Ignore the terminated instance during upgrade Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Added WARNING prefix in the logging Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Apply suggestions from code review Co-authored-by: Kevin Downey <[email protected]> Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Capitalize sprintf to Sprintf Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Upgrade to Go 1.15 (#128) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Fix few typos and simplify error returns, remove redundant types (#131) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Readiness gates implementation for eager mode (#130) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * Adding Functional Test (#113) * Adding BDD, workflow and badge * Changing CI workflow job name * Adding make manifests * Clarifying cron time zone comment Signed-off-by: sbadiger <[email protected]> * Validation step to check Nodes and ASG launch configs (#112) * Validation step to check Nodes and ASG launch configs * Validating launch definition after a rolling upgrade * Resolve error log message and return statement Co-authored-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * release 0.13 (#115) * release 0.13 * Update CHANGELOG.md Signed-off-by: sbadiger <[email protected]> * bump version (#116) Signed-off-by: sbadiger <[email protected]> * Repo selection for CI and BDD 
workflows & CI step for releases (#117) * CI-BDD not on forks & Step for releases (#2) * Testing CI-BDD not on forks & Step for releases * Adding step for image with tag git-tag Signed-off-by: sbadiger <[email protected]> * Terminate unjoined nodes (#120) * Validation step to check Nodes and ASG launch configs * Validating launch definition after a rolling upgrade * Resolve error log message and return statement * Terminate unjoined nodes * Resolving PR comments Co-authored-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * Set version and update CHANGELOG for version 0.14. (#121) Co-authored-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Bump version to 0.15-dev. Signed-off-by: sbadiger <[email protected]> * Fix bug when switching to launch templates (#136) * Update rollingupgrade_controller.go * Update rollingupgrade_controller.go Signed-off-by: Eytan Avisror <[email protected]> * spacing fixes Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * Extract script runner to a separate type; fix work with env. variables (#132) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Set version and update CHANGELOG for version v0.15 (#137) Signed-off-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Bump version to v0.16-dev. 
Signed-off-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Propagate parent env variables to allow to talk with API Server (#144) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Bump Golang CI action to fix failed CI run (#146) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Simplify (#145) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Add Expiration to cache and do not refresh ASG if cache is not expired (#143) Signed-off-by: Oleg Atamanenko <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Fix documentation for uniform across AZ Update strategy and fix typos (#147) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Move cluster state from package level to a cluster state impl (#148) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Simplify work with intstr type. (#149) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * If instance is in standby mode already, just return (#138) Signed-off-by: Oleg Atamanenko <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Handle terminated instances gracefully. 
(#150) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Template version comparison fix (#155) * get template version Signed-off-by: Eytan Avisror <[email protected]> * fix tests Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * release 0.16 (#157) Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * bump version to 0.17-dev (#158) Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * Don't uncordon node on failure to run postDrain script when IgnoreDrainFailures set (#151) * Don't uncordon node on failure to run postDrain script when IgnoreDrainFailures set Signed-off-by: Adam Malcontenti-Wilson <[email protected]> * Test node uncordon when postDrain / postDrainWait script fails Signed-off-by: Adam Malcontenti-Wilson <[email protected]> Signed-off-by: sbadiger <[email protected]> * Abort on strategy failure instead of continuing (#152) * Abort on strategy failure instead of continuing Signed-off-by: Adam Malcontenti-Wilson <[email protected]> * Remove unformatted error message placeholder Signed-off-by: Adam Malcontenti-Wilson <[email protected]> * Explicitly specify strategy for tests Signed-off-by: Adam Malcontenti-Wilson <[email protected]> Signed-off-by: sbadiger <[email protected]> * use NamespacedName (#160) Signed-off-by: Eytan Avisror <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Set version and update CHANGELOG for version v0.17 (#161) Signed-off-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Bump version to v0.18-dev (#162) Signed-off-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Move constants to types so that they can be reused (#167) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * 
Remove separate module for pkg/log (#168) Signed-off-by: Oleg Atamanenko <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Bump dependencies. (#169) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * use standard fmt.Errorf to format error message; unify error format (#171) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Fix namespaced name order (#170) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Add instance id to the logs (#173) Signed-off-by: Oleg Atamanenko <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Bump golang and busybox (#172) Signed-off-by: Oleg Atamanenko <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Signed-off-by: sbadiger <[email protected]> * Expose template list and other execution errors to logs (#166) * Log and return wrapped launchtemplate error Signed-off-by: Adam Malcontenti-Wilson <[email protected]> * Expose execution error in logs Signed-off-by: Adam Malcontenti-Wilson <[email protected]> Signed-off-by: sbadiger <[email protected]> * output can contain other messages from API Server, so be more relaxed (#174) Signed-off-by: Oleg Atamanenko <[email protected]> Signed-off-by: sbadiger <[email protected]> * Delete README.md Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * delete all Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * scaffolding Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * add API Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * initial code Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * add 
more scaffolding Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * Add kubernetes API calls Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * aws API calls Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * AWS API calls & Drift detection Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * validate() function Signed-off-by: shreyas-badiger <[email protected]> Signed-off-by: sbadiger <[email protected]> * modified validate() Signed-off-by: sbadiger <[email protected]> * modified validate() Signed-off-by: sbadiger <[email protected]> * initial rotation logic Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * basic script_runner without any modifications Signed-off-by: sbadiger <[email protected]> * Fix all the vet related errors Signed-off-by: sbadiger <[email protected]> Co-authored-by: Alfredo Garo <[email protected]> Co-authored-by: Eytan Avisror <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Co-authored-by: Shri Javadekar <[email protected]> Co-authored-by: Craig Robson <[email protected]> Co-authored-by: Kevin Downey <[email protected]> Co-authored-by: Oleg Atamanenko <[email protected]> Co-authored-by: Shreyas Badiger <[email protected]> Co-authored-by: Adam Malcontenti-Wilson <[email protected]> Co-authored-by: Adam Malcontenti-Wilson <[email protected]> Co-authored-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * Controller v2: Implementation of Instance termination (#178) * fix make vet errors. Signed-off-by: sbadiger <[email protected]> * Terminate instances and run v2 for first time. 
Signed-off-by: sbadiger <[email protected]> * Addressing review comments Signed-off-by: sbadiger <[email protected]> * addressing more review comments Signed-off-by: sbadiger <[email protected]> * Log error message Signed-off-by: sbadiger <[email protected]> * error handling for instance tagging Signed-off-by: sbadiger <[email protected]> * Migrate Script Runner (#179) * Basic script runner Signed-off-by: Eytan Avisror <[email protected]> * Update upgrade.go Signed-off-by: Eytan Avisror <[email protected]> Signed-off-by: sbadiger <[email protected]> * Implemented node drain. (#181) Signed-off-by: sbadiger <[email protected]> * Eager mode implementation (#183) * Eager mode implementation Signed-off-by: sbadiger <[email protected]> * Metrics features (#189) Signed-off-by: xshao <[email protected]> Signed-off-by: sbadiger <[email protected]> * Process the batch rotation in parallel (#192) * Process the batch rotation in parallel Signed-off-by: sbadiger <[email protected]> * addressing review comments Signed-off-by: sbadiger <[email protected]> * Move the DrainManager within ReplaceBatch(), to access one per RollingUpgrade CR (#195) Signed-off-by: sbadiger <[email protected]> * Refine metrics implementation to support goroutines (#196) * Refine the metrics status Signed-off-by: xshao <[email protected]> * Refine the metrics status Signed-off-by: xshao <[email protected]> * Fix test case error Signed-off-by: xshao <[email protected]> * Use group instead of ASG Signed-off-by: xshao <[email protected]> Signed-off-by: sbadiger <[email protected]> * Ignore generated code (#201) * Refine the metrics status Signed-off-by: xshao <[email protected]> * Refine the metrics status Signed-off-by: xshao <[email protected]> * Fix test case error Signed-off-by: xshao <[email protected]> * Use group instead of ASG Signed-off-by: xshao <[email protected]> * Ignore generated code Signed-off-by: xshao <[email protected]> * Ignore generated code Signed-off-by: xshao <[email protected]> 
Signed-off-by: sbadiger <[email protected]> * Fix bug in deleting the entry in syncMap (#203) Signed-off-by: sbadiger <[email protected]> * Unit tests for controller-v2 (#215) * Unit tests Signed-off-by: sbadiger <[email protected]> * minor change in accessing the namespace name Signed-off-by: sbadiger <[email protected]> * move helper functions to a differnt file Signed-off-by: sbadiger <[email protected]> * #2285: rollup CR statistic metrics in v2 (#218) * #2285: rollup CR statistic metrics in v2 Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * #2285: renamed some methods related to metrics (#224) Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * #2286: removed version from metric namespace (#227) Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * Create RollingUpgradeContext (#234) * #2285: rollup CR statistic metrics in v2 (#218) * #2285: rollup CR statistic metrics in v2 Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * log cloud discovery failure Signed-off-by: sbadiger <[email protected]> * Create RollingUpgrade Context Signed-off-by: sbadiger <[email protected]> * rollingupgrade context Signed-off-by: sbadiger <[email protected]> Co-authored-by: Sahil Badla <[email protected]> Signed-off-by: sbadiger <[email protected]> * Resolve compile errors caused by merge conflict. 
(#235) * #2285: rollup CR statistic metrics in v2 (#218) * #2285: rollup CR statistic metrics in v2 Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * log cloud discovery failure Signed-off-by: sbadiger <[email protected]> * Create RollingUpgrade Context Signed-off-by: sbadiger <[email protected]> * rollingupgrade context Signed-off-by: sbadiger <[email protected]> * resolve compile errors due to merge conflict Signed-off-by: sbadiger <[email protected]> Co-authored-by: Sahil Badla <[email protected]> Signed-off-by: sbadiger <[email protected]> * upgrade-manager-v2: Move DrainManager back to Reconciler (#236) * #2285: rollup CR statistic metrics in v2 (#218) * #2285: rollup CR statistic metrics in v2 Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * log cloud discovery failure Signed-off-by: sbadiger <[email protected]> * Create RollingUpgrade Context Signed-off-by: sbadiger <[email protected]> * rollingupgrade context Signed-off-by: sbadiger <[email protected]> * #2285: rollup CR statistic metrics in v2 (#218) * #2285: rollup CR statistic metrics in v2 Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * #2285: renamed some methods related to metrics (#224) Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * #2286: removed version from metric namespace (#227) Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * resolve compile errors due to merge 
conflict Signed-off-by: sbadiger <[email protected]> * move drain-manager to reconciler Signed-off-by: sbadiger <[email protected]> * initialize RollingUpgrade object Signed-off-by: sbadiger <[email protected]> * use bool instead of count for standby function Signed-off-by: sbadiger <[email protected]> * refactor in-progress and standby code Signed-off-by: sbadiger <[email protected]> * rename instance standby function Signed-off-by: sbadiger <[email protected]> * DrainManager changes in unit test files Signed-off-by: sbadiger <[email protected]> Co-authored-by: Sahil Badla <[email protected]> Signed-off-by: sbadiger <[email protected]> * V2 controller metrics concurrency fix (#231) * Refine the metrics status Signed-off-by: xshao <[email protected]> * Refine the metrics status Signed-off-by: xshao <[email protected]> * Fix test case error Signed-off-by: xshao <[email protected]> * Use group instead of ASG Signed-off-by: xshao <[email protected]> * Ignore generated code Signed-off-by: xshao <[email protected]> * Ignore generated code Signed-off-by: xshao <[email protected]> * Fix the concurrent issue Signed-off-by: xshao <[email protected]> * Fix the concurrent issue Signed-off-by: xshao <[email protected]> * Move metrics related functions into RollingUpgradeContext Signed-off-by: xshao <[email protected]> * Move metrics related functions into RollingUpgradeContext Signed-off-by: xshao <[email protected]> * Move metrics related functions into upgrade_metrics.go Signed-off-by: xshao <[email protected]> * Move metrics related functions into metrics.go Signed-off-by: xshao <[email protected]> Signed-off-by: sbadiger <[email protected]> * add missing parenthesis (#239) Signed-off-by: sbadiger <[email protected]> * metricsMutex should be initialized (#240) Signed-off-by: xshao <[email protected]> Signed-off-by: sbadiger <[email protected]> * upgrade-manager-v2: Load test fixes (#245) * upgrade-manager-v2: Move DrainManager back to Reconciler (#236) * #2285: rollup CR 
statistic metrics in v2 (#218) * #2285: rollup CR statistic metrics in v2 Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * log cloud discovery failure Signed-off-by: sbadiger <[email protected]> * Create RollingUpgrade Context Signed-off-by: sbadiger <[email protected]> * rollingupgrade context Signed-off-by: sbadiger <[email protected]> * #2285: rollup CR statistic metrics in v2 (#218) * #2285: rollup CR statistic metrics in v2 Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> * #2285: updated metric flags Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * #2285: renamed some methods related to metrics (#224) Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * #2286: removed version from metric namespace (#227) Signed-off-by: sbadla1 <[email protected]> Signed-off-by: sbadiger <[email protected]> * resolve compile errors due to merge conflict Signed-off-by: sbadiger <[email protected]> * move drain-manager to reconciler Signed-off-by: sbadiger <[email protected]> * initialize RollingUpgrade object Signed-off-by: sbadiger <[email protected]> * use bool instead of count for standby function Signed-off-by: sbadiger <[email protected]> * refactor in-progress and standby code Signed-off-by: sbadiger <[email protected]> * rename instance standby function Signed-off-by: sbadiger <[email protected]> * DrainManager changes in unit test files Signed-off-by: sbadiger <[email protected]> Co-authored-by: Sahil Badla <[email protected]> Signed-off-by: sbadiger <[email protected]> * V2 controller metrics concurrency fix (#231) * Refine the metrics status Signed-off-by: xshao <[email protected]> * Refine the metrics status Signed-off-by: xshao <[email 
protected]> * Fix test case error Signed-off-by: xshao <[email protected]> * Use group instead of ASG Signed-off-by: xshao <[email protected]> * Ignore generated code Signed-off-by: xshao <[email protected]> * Ignore generated code Signed-off-by: xshao <[email protected]> * Fix the concurrent issue Signed-off-by: xshao <[email protected]> * Fix the concurrent issue Signed-off-by: xshao <[email protected]> * Move metrics related functions into RollingUpgradeContext Signed-off-by: xshao <[email protected]> * Move metrics related functions into RollingUpgradeContext Signed-off-by: xshao <[email protected]> * Move metrics related functions into upgrade_metrics.go Signed-off-by: xshao <[email protected]> * Move metrics related functions into metrics.go Signed-off-by: xshao <[email protected]> Signed-off-by: sbadiger <[email protected]> * add missing parenthesis Signed-off-by: sbadiger <[email protected]> * load test fixes Signed-off-by: sbadiger <[email protected]> * handle scaling group not found Signed-off-by: sbadiger <[email protected]> * Update upgrade.go Signed-off-by: sbadiger <[email protected]> * log one level up * remove double logging Signed-off-by: sbadiger <[email protected]> * final push before RC release. 
(#254) * support IgnoreDrainFailures flag Signed-off-by: sbadiger <[email protected]> * add else condition Signed-off-by: sbadiger <[email protected]> * set min for maxUnavailable Signed-off-by: sbadiger <[email protected]> * calculateMaxUnavailable function Signed-off-by: sbadiger <[email protected]> * add a new coloumn (completePercentage) Signed-off-by: sbadiger <[email protected]> * disable debug logs by default Signed-off-by: sbadiger <[email protected]> * Fix metrics collecting issue (#249) * metricsMutex should be initialized Signed-off-by: xshao <[email protected]> * Use InProcessingNode instead of Stringp[] so that it can have the status of steps Signed-off-by: xshao <[email protected]> Signed-off-by: sbadiger <[email protected]> * Revert "Fix metrics collecting issue (#249)" (#256) This reverts commit f5dd1cb5f76f2b78cb15c53daed14032a2a4c6ec. Signed-off-by: sbadiger <[email protected]> * Fix metrics calculation issue (#258) * metricsMutex should be initialized Signed-off-by: xshao <[email protected]> * Use InProcessingNode instead of Stringp[] so that it can have the status of steps Signed-off-by: xshao <[email protected]> * Make the change backward compatible Signed-off-by: xshao <[email protected]> * Make the change backward compatible Signed-off-by: xshao <[email protected]> * Add mutex for InProcessingNode deleting Signed-off-by: xshao <[email protected]> Signed-off-by: sbadiger <[email protected]> * …
Fix for #126.
This is a work in progress.